192 research outputs found

    Scalable Distributed DNN Training using TensorFlow and CUDA-Aware MPI: Characterization, Designs, and Performance Evaluation

    TensorFlow has been the most widely adopted Machine/Deep Learning framework. However, little exists in the literature that provides a thorough understanding of the capabilities which TensorFlow offers for the distributed training of large ML/DL models that need computation and communication at scale. The most commonly used distributed training approaches for TensorFlow (TF) can be categorized as follows: 1) Google Remote Procedure Call (gRPC), 2) gRPC+X: X=(InfiniBand Verbs, Message Passing Interface, and GPUDirect RDMA), and 3) No-gRPC: Baidu Allreduce with MPI, Horovod with MPI, and Horovod with NVIDIA NCCL. In this paper, we provide an in-depth performance characterization and analysis of these distributed training approaches on various GPU clusters including the Piz Daint system (6 on Top500). We perform experiments to gain novel insights along the following vectors: 1) Application-level scalability of DNN training, 2) Effect of Batch Size on scaling efficiency, 3) Impact of the MPI library used for no-gRPC approaches, and 4) Type and size of DNN architectures. Based on these experiments, we present two key insights: 1) Overall, No-gRPC designs achieve better performance compared to gRPC-based approaches for most configurations, and 2) The performance of No-gRPC is heavily influenced by the gradient aggregation using Allreduce. Finally, we propose a truly CUDA-Aware MPI Allreduce design that exploits CUDA kernels and pointer caching to perform large reductions efficiently. Our proposed designs offer 5-17X better performance than NCCL2 for small and medium messages, and reduce latency by 29% for large messages. The proposed optimizations help Horovod-MPI to achieve approximately 90% scaling efficiency for ResNet-50 training on 64 GPUs. Further, Horovod-MPI achieves 1.8X and 3.2X higher throughput than the native gRPC method for ResNet-50 and MobileNet, respectively, on the Piz Daint cluster. Comment: 10 pages, 9 figures, submitted to IEEE IPDPS 2019 for peer-review

    Optimized Broadcast for Deep Learning Workloads on Dense-GPU InfiniBand Clusters: MPI or NCCL?

    Dense Multi-GPU systems have recently gained a lot of attention in the HPC arena. Traditionally, MPI runtimes have been primarily designed for clusters with a large number of nodes. However, with the advent of MPI+CUDA applications and CUDA-Aware MPI runtimes like MVAPICH2 and OpenMPI, it has become important to address efficient communication schemes for such dense Multi-GPU nodes. This, coupled with new application workloads brought forward by Deep Learning frameworks like Caffe and Microsoft CNTK, poses additional design constraints due to very large message communication of GPU buffers during the training phase. In this context, special-purpose libraries like NVIDIA NCCL have been proposed for GPU-based collective communication on dense GPU systems. In this paper, we propose a pipelined chain (ring) design for the MPI_Bcast collective operation along with an enhanced collective tuning framework in MVAPICH2-GDR that enables efficient intra-/inter-node multi-GPU communication. We present an in-depth performance landscape for the proposed MPI_Bcast schemes along with a comparative analysis of NVIDIA NCCL Broadcast and NCCL-based MPI_Bcast. The proposed designs for MVAPICH2-GDR enable up to 14X and 16.6X improvement, compared to NCCL-based solutions, for intra- and inter-node broadcast latency, respectively. In addition, the proposed designs provide up to 7% improvement over NCCL-based solutions for data parallel training of the VGG network on 128 GPUs using Microsoft CNTK. Comment: 8 pages, 3 figures

    Association between health examination items and body mass index among school children in Hualien, Taiwan

    BACKGROUND: To assess the prevalence of obesity and major physical examination items including dental caries, myopia, pinworm, hematuria, and proteinuria among school children in Hualien, Taiwan. In addition, differences in health status by gender, grade, level of residence urbanization, and body mass index (BMI) were examined. METHODS: In this cross-sectional study, a total of 11,080 students (age, 7–14 years) in grades 1, 4, and 7 were evaluated for weight, height, routine physical examination, and urine analysis during the 2010 Student Health Examination in Hualien. Frequencies, Chi-square tests, and logistic regression were conducted using SPSS. RESULTS: Of the 11,080 students evaluated, 1357 (12.2%) were overweight, and 1421 (12.8%) were obese. There were significant differences in overweight/obese prevalence by gender, by grade, and by level of residence urbanization. Dental caries, myopia, and obesity were the most prevalent health problems among these students (75.6%, 33.0%, and 12.8%, respectively). In crude and adjusted analyses, there were significant differences in the prevalence of major physical examination items between gender, grade, residence urbanization, and BMI groups. Girls had a higher prevalence of dental caries, myopia, and hematuria than boys (all p < 0.01), whereas boys had a higher prevalence of pinworm than girls (p = 0.02). Students in higher grades had significantly higher prevalence of myopia, hematuria, and proteinuria (all p < 0.01), whereas students in lower grades had higher prevalence of dental caries and pinworm (p < 0.01). Students with abnormal BMI had lower prevalence of pinworm (p < 0.01). Students residing in suburban and rural areas had higher prevalence of dental caries, pinworm, and hematuria (all p < 0.01), and lower prevalence of myopia than students residing in urban areas (all p < 0.01). CONCLUSION: Routine health examination provides an important way to detect students’ health problems. Our study elucidated major health problems among school children in Hualien, Taiwan. In addition, the results indicated that the prevalence of health problems had a significant relationship with gender, grade, level of residence urbanization, and BMI. It is suggested that school health interventions should consider students’ health profiles along with their risk factor status in planning.

    High-Frequency Sea Level Variations Observed by GPS Buoys Using Precise Point Positioning Technique

    In this study, sea level variation observed by a 1-Hz Global Positioning System (GPS) buoy system is verified by comparison with tide gauge records and is decomposed to reveal high-frequency signals that cannot be detected from 6-minute tide gauge records. Compared to tide gauges, which are traditionally used to monitor sea level changes and are affected by land motion, GPS buoys provide high-frequency geocentric measurements of sea level variations. Data from five GPS buoy campaigns near a tide gauge at Anping, Tainan, Taiwan, were processed using the Precise Point Positioning (PPP) technique with four different satellite orbit products from the International GNSS Service (IGS). The GPS buoy data were also processed by a differential GPS (DGPS) method, which needs an additional GPS receiver as a reference station and whose solution accuracy depends on the baseline length. The computation shows that the average Root Mean Square Error (RMSE) difference between the GPS buoy using DGPS and the tide gauge records is around 3 - 5 cm. When the aforementioned IGS orbit products are used for the buoy solution derived by PPP, the average RMSE differences are 5 - 8 cm, 8 - 13 cm, decimeter level, and decimeter-meter level, respectively, so the accuracy of the solution derived by PPP depends strongly on the accuracy of the IGS orbit products. The result therefore indicates that a GPS buoy using PPP has the potential to measure sea surface variations to an accuracy of several cm. Finally, high-frequency sea level signals with periods of a few seconds to a day can be successfully detected in GPS buoy observations using the Ensemble Empirical Mode Decomposition (EEMD) method and are identified as waves, meteotsunamis, and tides.

    Electroconvulsive Therapy and Risk of Dementia—A Nationwide Cohort Study in Taiwan

    Background: Electroconvulsive therapy (ECT) is an effective treatment for schizophrenia, bipolar disorder, and major depressive disorder, and a temporary memory loss may occur after ECT. However, the association between ECT in patients with schizophrenia, bipolar disorder, and major depressive disorder, and the risk of dementia is yet to be examined. Objective: This study aimed to clarify whether ECT is associated with the risk of dementia in patients with schizophrenia, bipolar disorder, and major depressive disorder, using Taiwan's National Health Insurance Research Database (NHIRD). Methods: A total of 3,796 enrolled participants (schizophrenia, 46.68%; bipolar disorder, 11.77%; and major depressive disorder, 41.55%) with 994 patients who had received ECT and 2,982 controls matched for sex and age, between January 1, and December 31, 2000, were selected from the NHIRD. After adjusting for confounding factors, Fine and Gray's survival analysis was used to compare the risk of developing dementia during the 10 years of follow-up. Results: Of the study patients, 45 (4.53%) developed dementia, compared with 149 (5.0%) in the control group. Fine and Gray's survival analysis revealed that the study patients did not have an increased risk of dementia [hazard ratio (HR) = 0.612, 95% confidence interval (CI) = 0.438–1.854, P = 0.325]. After adjusting for sex, age, monthly income, urbanization level, geographic region, and comorbidities, the adjusted HR was 0.633 (95% CI = 0.448–1.895, P = 0.304). Conclusion: This study supports the finding that ECT was not associated with an increased risk of dementia in patients with schizophrenia, bipolar disorder, and major depressive disorder, using the NHIRD.

    Assessing the Decision-Making Process in Human-Robot Collaboration Using a Lego-like EEG Headset

    Human-robot collaboration (HRC) has become an emerging field, where the role of a robotic agent has shifted from a supportive machine to a decision-making collaborator. A variety of factors can influence the effectiveness of decision-making processes during HRC, including system-related (e.g., robot capability) and human-related (e.g., individual knowledgeability) factors. As a variety of contextual factors can significantly impact the human-robot decision-making process in collaborative contexts, the present study adopts a Lego-like EEG headset to collect and examine human brain activities and utilizes multiple questionnaires to evaluate participants’ cognitive perceptions toward the robot. A user study was conducted where two levels of robot capabilities (high vs. low) were manipulated to provide system recommendations. The participants were also divided into two groups based on their computational thinking (CT) ability. The EEG results revealed that different levels of CT ability trigger different brainwaves, and that the participants’ trust calibration of the robot also affects the resultant brain activities.

    Fibrate and the risk of cardiovascular disease among moderate chronic kidney disease patients with primary hypertriglyceridemia

    Introduction: Hypertriglyceridemia is the most prevalent dyslipidemia in patients with chronic kidney disease (CKD). However, research about fibrate treatment in CKD patients is limited, and assessing its benefits becomes challenging due to the frequent concurrent use of statins. Thus, this study aimed to investigate the role of fibrate in CKD stage 3 patients with hypertriglyceridemia who did not receive other lipid-lowering agents. Methods: This study enrolled patients from the Chang Gung Research Database who were newly diagnosed with CKD stage 3, had LDL-C < 100 mg/dL, and had never received statins or other lipid-lowering agents. The participants were categorized into 2 groups based on the use of fibrate: fibrate group and non-fibrate group (triglyceride > 200 mg/dL but not receiving fibrate treatment). Inverse probability of treatment weighting was performed to balance baseline characteristics. Results: Compared with the non-fibrate group (n=2020), the fibrate group (n=705) exhibited significantly lower risks of major adverse cardiac and cerebrovascular events (MACCEs) (10.4% vs. 12.8%, hazard ratio [HR]: 0.69, 95% confidence interval [CI]: 0.50 to 0.95), AMI (2.3% vs. 3.9%, HR: 0.52, 95% CI: 0.37 to 0.73), and ischemic stroke (6.3% vs. 8.0%, HR: 0.67, 95% CI: 0.52 to 0.85). The risk of all-cause mortality (5.1% vs. 4.5%, HR: 1.09, 95% CI: 0.67 to 1.79) and death from CV (2.8% vs. 2.3%, HR: 1.07, 95% CI: 0.29 to 2.33) did not significantly differ between the 2 groups. Conclusion: This study suggests that, in moderate CKD patients with hypertriglyceridemia but LDL-C < 100 mg/dL who did not take other lipid-lowering agents, fibrates may be beneficial in reducing cardiovascular events.